    Clustering with shallow trees

    We propose a new method for hierarchical clustering based on the optimisation of a cost function over trees of limited depth, and we derive a message-passing method that solves it efficiently. The method and algorithm can be interpreted as a natural interpolation between two well-known approaches, namely single linkage and the recently presented Affinity Propagation. We use this general scheme to analyze three structured biological/medical datasets (human populations based on genetic information, proteins based on their sequences, and verbal autopsies) and show that the interpolation technique provides new insight.
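
    As a point of reference, the sketch below runs the two endpoints the paper interpolates between, single linkage and Affinity Propagation, on a toy dataset; the paper's own depth-limited message-passing algorithm is not reproduced here, and the data and parameters are illustrative assumptions.

```python
# A minimal sketch of the two clustering endpoints the paper
# interpolates between; not the paper's message-passing algorithm.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])

# Endpoint 1: single linkage (deep, chain-like trees).
Z = linkage(pdist(X), method="single")
labels_sl = fcluster(Z, t=2, criterion="maxclust")

# Endpoint 2: Affinity Propagation (depth-one trees: exemplars + leaves).
labels_ap = AffinityPropagation(random_state=0).fit_predict(X)

print("single linkage clusters:", np.unique(labels_sl).size)
print("affinity propagation clusters:", np.unique(labels_ap).size)
```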

    Feature-to-feature regression for a two-step conditional independence test

    Algorithms for causal discovery, and more broadly for learning the structure of graphical models, require well-calibrated and consistent conditional independence (CI) tests. We revisit CI tests based on two-step procedures that involve regression with a subsequent (unconditional) independence test on the regression residuals (RESIT), and we investigate the assumptions under which these tests operate. In particular, we demonstrate that when going beyond simple functional relationships with additive noise, such tests can lead to an inflated number of false discoveries. We study the relationship of these tests with those based on dependence measures using reproducing kernel Hilbert spaces (RKHS) and propose an extension of RESIT which uses RKHS-valued regression. The resulting test inherits the simple two-step testing procedure of RESIT while giving correct Type I error control and competitive power. When used as a component of the PC algorithm, the proposed test is more robust to the case where hidden variables induce a switching behaviour in the associations present in the data.
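
    To make the two-step structure concrete, the sketch below implements a basic RESIT-style test: regress X on Z and Y on Z, then run an unconditional test on the residuals. The regressor and the residual test are illustrative choices, not the paper's; the paper's point is precisely that such simple residual tests can miscalibrate beyond additive-noise models, and its RKHS-valued extension is not reproduced here.

```python
# A minimal sketch of a two-step RESIT-style CI test, assuming an
# additive-noise setting where residual independence is meaningful.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor

def resit_ci_test(x, y, z, seed=0):
    """Return a p-value for X independent of Y given Z (illustrative)."""
    rx = x - RandomForestRegressor(random_state=seed).fit(z, x).predict(z)
    ry = y - RandomForestRegressor(random_state=seed).fit(z, y).predict(z)
    return pearsonr(rx, ry)[1]  # unconditional test on residuals

rng = np.random.default_rng(1)
z = rng.normal(size=(500, 1))
x = z[:, 0] + 0.3 * rng.normal(size=500)        # X depends on Z only
y = z[:, 0] ** 2 + 0.3 * rng.normal(size=500)   # Y depends on Z only
print(f"p-value (should be large): {resit_ci_test(x, y, z):.3f}")
```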

    Bayesian kernel two-sample testing

    In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well studied in the frequentist literature, while development in the Bayesian setting is limited, with applications often restricted to the univariate case. Here, we propose a Bayesian kernel two-sample testing procedure based on modelling the difference between kernel mean embeddings in the reproducing kernel Hilbert space, utilising the framework established by Flaxman et al. (2016). The use of kernel methods enables its application to random variables in generic domains beyond multivariate Euclidean spaces. The proposed procedure results in a posterior inference scheme that allows an automatic selection of the kernel parameters relevant to the problem at hand. In a series of synthetic experiments and two real data experiments (i.e. testing network heterogeneity from high-dimensional data and six-membered monocyclic ring conformation comparison), we illustrate the advantages of our approach.
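
    The quantity underlying kernel two-sample testing is the discrepancy between kernel mean embeddings. The sketch below computes the standard unbiased estimate of the squared maximum mean discrepancy (MMD) with a Gaussian kernel; the paper's Bayesian posterior over this discrepancy is not reproduced, and the bandwidth and data are illustrative.

```python
# A minimal sketch of the unbiased MMD^2 estimator between two samples,
# the frequentist counterpart of the embedding difference the paper
# models in a Bayesian way.
import numpy as np
from scipy.spatial.distance import cdist

def mmd2_unbiased(X, Y, bandwidth=1.0):
    k = lambda A, B: np.exp(-cdist(A, B, "sqeuclidean") / (2 * bandwidth**2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    n, m = len(X), len(Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, (200, 3))
Y = rng.normal(0.5, 1.0, (200, 3))   # shifted mean: MMD^2 should be > 0
print(f"MMD^2 estimate: {mmd2_unbiased(X, Y):.4f}")
```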

    Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: a winning solution to the NIJ "Real-Time Crime Forecasting Challenge"

    We propose a generic spatiotemporal event forecasting method, which we developed for the National Institute of Justice's (NIJ) Real-Time Crime Forecasting Challenge (National Institute of Justice, 2017). Our method is a spatiotemporal forecasting model combining scalable randomized Reproducing Kernel Hilbert Space (RKHS) methods for approximating Gaussian processes with autoregressive smoothing kernels in a regularized supervised learning framework. While the smoothing kernels capture the two main approaches in current use in the field of crime forecasting, kernel density estimation (KDE) and self-exciting point process (SEPP) models, the RKHS component of the model can be understood as an approximation to the popular log-Gaussian Cox process model. For inference, we discretize the spatiotemporal point pattern and learn a log-intensity function using the Poisson likelihood and highly efficient gradient-based optimization methods. Model hyperparameters, including the quality of the RKHS approximation, spatial and temporal kernel lengthscales, the number of autoregressive lags, bandwidths for the smoothing kernels, as well as cell shape, size, and rotation, were learned using cross-validation. The resulting predictions significantly exceeded baseline KDE estimates and SEPP models for sparse events.
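
    The core modelling idea, a randomized RKHS approximation to a Gaussian process fitted by Poisson maximum likelihood on gridded counts, can be sketched as below. The random-feature map and regularized Poisson regression stand in for the paper's pipeline; the autoregressive smoothing kernels, the specific feature construction, and the hyperparameter search are omitted, and the toy grid and parameters are assumptions.

```python
# A minimal sketch: random Fourier features approximating a GP, with a
# log-intensity learned via the Poisson likelihood on cell counts.
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(3)
# Toy grid: (x, y, t) cell centres with counts from a smooth intensity.
coords = rng.uniform(0, 1, (1000, 3))
true_log_rate = np.sin(4 * coords[:, 0]) + coords[:, 1] - coords[:, 2]
counts = rng.poisson(np.exp(true_log_rate))

# Random Fourier features: a scalable approximation to an RBF-kernel GP.
features = RBFSampler(gamma=5.0, n_components=300, random_state=0)
Phi = features.fit_transform(coords)

# PoissonRegressor uses a log link, so the linear predictor on Phi is
# the estimated log-intensity over the grid.
model = PoissonRegressor(alpha=1e-3, max_iter=500).fit(Phi, counts)
print(f"in-sample deviance score: {model.score(Phi, counts):.3f}")
```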

    Evaluating distributional regression strategies for modelling self-reported sexual age-mixing

    The age dynamics of sexual partnership formation determine patterns of sexually transmitted disease transmission and have long been a focus of researchers studying human immunodeficiency virus. Data on self-reported sexual partner age distributions are available from a variety of sources. We sought to explore statistical models that accurately predict the distribution of sexual partner ages over age and sex. We identified which probability distributions and outcome specifications best captured variation in partner age, and quantified the benefits of modelling these data using distributional regression. We found that distributional regression with a sinh-arcsinh distribution replicated observed partner age distributions most accurately across three geographically diverse data sets. This framework can be extended with well-known hierarchical modelling tools and can help improve estimates of sexual age-mixing dynamics.
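
    For readers unfamiliar with the sinh-arcsinh distribution, the sketch below samples from it via its defining transform of a standard normal, using the Jones and Pewsey (2009) parameterisation. This is an assumption about the form used; the paper's regression setup and all parameter values here are illustrative.

```python
# A minimal sketch of sinh-arcsinh sampling: location xi, scale eta,
# skewness eps (> 0 skews right), tail weight delta (< 1 heavier tails).
import numpy as np

def sample_sinh_arcsinh(xi, eta, eps, delta, size, rng):
    z = rng.standard_normal(size)
    return xi + eta * np.sinh((np.arcsinh(z) + eps) / delta)

rng = np.random.default_rng(4)
# e.g. partner ages: right-skewed around 30 (illustrative values only).
ages = sample_sinh_arcsinh(xi=30, eta=5, eps=0.8, delta=1.0,
                           size=10_000, rng=rng)
print(f"mean {ages.mean():.1f} vs median {np.median(ages):.1f} "
      "(mean > median indicates right skew)")
```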

    PriorVAE: encoding spatial priors with variational autoencoders for small-area estimation.

    Gaussian processes (GPs), implemented through multivariate Gaussian distributions for a finite collection of data, are the most popular approach in small-area spatial statistical modelling. In this context, they are used to encode correlation structures over space and can generalize well in interpolation tasks. Despite their flexibility, off-the-shelf GPs present serious computational challenges which limit their scalability and practical usefulness in applied settings. Here, we propose a novel deep generative modelling approach to tackle this challenge, termed PriorVAE: for a particular spatial setting, we approximate a class of GP priors through prior sampling and the subsequent fitting of a variational autoencoder (VAE). Given a trained VAE, the resultant decoder makes spatial inference highly efficient due to the low-dimensional, independently distributed latent Gaussian space representation of the VAE. Once trained, inference using the VAE decoder replaces the GP within a Bayesian sampling framework. This approach provides a tractable and easy-to-implement means of approximately encoding spatial priors and facilitates efficient statistical inference. We demonstrate the utility of our two-stage VAE approach on Bayesian small-area estimation tasks.
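
    The two-stage idea can be sketched compactly: draw samples from a GP prior on a fixed grid, then fit a small VAE to those draws so the decoder maps a low-dimensional Gaussian latent to GP-like fields. Everything below (network sizes, kernel, lengthscale, a 1-D grid rather than a spatial one) is an illustrative assumption, not the paper's configuration.

```python
# A minimal PriorVAE-style sketch: (1) GP prior draws, (2) VAE fitting.
import numpy as np
import torch
import torch.nn as nn

# Stage 1: GP prior draws on a 1-D grid (RBF kernel, lengthscale 0.1).
grid = np.linspace(0, 1, 64)
K = np.exp(-0.5 * (grid[:, None] - grid[None, :])**2 / 0.1**2)
L = np.linalg.cholesky(K + 1e-6 * np.eye(64))
draws = torch.tensor((L @ np.random.randn(64, 5000)).T, dtype=torch.float32)

# Stage 2: a small VAE trained on the prior draws.
latent = 8
enc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2 * latent))
dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, 64))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

for step in range(2000):
    x = draws[torch.randint(0, 5000, (128,))]
    mu, logvar = enc(x).chunk(2, dim=1)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparam.
    kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(dim=1).mean()
    loss = ((dec(z) - x)**2).sum(dim=1).mean() + kl
    opt.zero_grad(); loss.backward(); opt.step()

# At inference time, dec(torch.randn(1, latent)) stands in for a GP
# prior draw inside a Bayesian sampler.
```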

    Modeling and forecasting art movements with CGANs

    Conditional generative adversarial networks (CGANs) are a recent and popular method for generating samples from a probability distribution conditioned on latent information. The latent information often comes in the form of a discrete label from a small set. We propose a novel method for training CGANs which allows us to condition on a sequence of continuous latent distributions f(1), …, f(K). This training allows CGANs to generate samples from a sequence of distributions. We apply our method to paintings from a sequence of artistic movements, where each movement is considered to be its own distribution. Exploiting the temporal aspect of the data, a vector autoregressive (VAR) model is fitted to the means of the latent distributions that we learn and used for one-step-ahead forecasting, to predict the latent distribution of a future art movement, f(K+1). Realizations from this distribution can be used by the CGAN to generate 'future' paintings. In experiments, this novel methodology generates accurate predictions of the evolution of art. The training set consists of a large dataset of past paintings. While there is no agreement on exactly what current art period we find ourselves in, we test on plausible candidate sets of present art and show that the mean distance to our predictions is small.
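
    The forecasting step is standard VAR modelling of the per-movement latent means. The sketch below fits a lag-1 VAR and produces the one-step-ahead forecast; the latent means are synthetic stand-ins for those the CGAN would learn, and the dimensions are illustrative.

```python
# A minimal sketch of the VAR one-step-ahead forecast of the latent
# mean for movement K+1, using synthetic latent means.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(5)
# K = 10 movements, each with a 4-dimensional latent mean drifting
# smoothly over time (stand-in for the CGAN-learned means).
K, d = 10, 4
means = np.cumsum(rng.normal(0.1, 0.05, (K, d)), axis=0)

model = VAR(means).fit(maxlags=1)
mean_next = model.forecast(means[-model.k_ar:], steps=1)  # for f(K+1)
print("forecast latent mean for movement K+1:", np.round(mean_next, 2))
```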

    Probabilistic Analysis of Facility Location on Random Shortest Path Metrics

    The facility location problem is an NP-hard optimization problem, so approximation algorithms are often used to solve large instances. Such algorithms often perform much better than worst-case analysis suggests, which makes probabilistic analysis a widely used tool for studying them. Most research on probabilistic analysis of NP-hard optimization problems involving metric spaces, such as the facility location problem, has focused on Euclidean instances; instances with independent (random) edge lengths, which are non-metric, have also been studied. We would like to extend this knowledge to other, more general metrics. We investigate the facility location problem using random shortest path metrics. We analyze some probabilistic properties of a simple greedy heuristic which gives a solution to the facility location problem: opening the $\kappa$ cheapest facilities (with $\kappa$ depending only on the facility opening costs). If the facility opening costs are such that $\kappa$ is not too large, then we show that this heuristic is asymptotically optimal. On the other hand, for large values of $\kappa$, the analysis becomes more difficult, and we provide a closed-form expression as an upper bound for the expected approximation ratio. In the special case where all facility opening costs are equal, this closed-form expression reduces to $O(\sqrt[4]{\ln(n)})$, or $O(1)$, or even $1+o(1)$ if the opening costs are sufficiently small.
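
    To make the setting concrete, the sketch below builds a random shortest path metric (i.i.d. exponential edge weights on a complete graph, closed under shortest paths) and applies the analysed heuristic: open the κ cheapest facilities and connect every vertex to its nearest open facility. The instance size, cost distribution, and choice of κ are illustrative assumptions.

```python
# A minimal sketch of the greedy heuristic on a random shortest
# path metric.
import numpy as np
from scipy.sparse.csgraph import shortest_path

rng = np.random.default_rng(6)
n = 100
W = rng.exponential(1.0, (n, n))
W = np.triu(W, 1) + np.triu(W, 1).T          # symmetric edge weights
np.fill_diagonal(W, 0.0)
dist = shortest_path(W, method="D")          # shortest-path closure

opening_costs = rng.uniform(0.5, 2.0, n)
kappa = 5                                    # the heuristic's only knob
open_facilities = np.argsort(opening_costs)[:kappa]

# Total cost = opening costs + connection cost of each vertex to its
# nearest open facility.
cost = (opening_costs[open_facilities].sum()
        + dist[:, open_facilities].min(axis=1).sum())
print(f"greedy solution cost with kappa={kappa}: {cost:.2f}")
```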